Skip to content

Conversation

tradeqvest
Copy link
Contributor

@tradeqvest tradeqvest commented Sep 22, 2025

Fix parallel tool call limit enforcement

Problem

The tool_calls_limit in UsageLimits was not properly enforced for parallel tool execution. When multiple tools were called in parallel, all tools would start executing before the limit was checked, allowing the limit to be exceeded before raising UsageLimitExceeded.

For example, if tool_calls_limit=6 and the model returned 8 parallel tool calls, all 8 tools would start executing before the error was raised.

Solution

This PR modifies the parallel tool execution logic in _agent_graph.py to enforce the limit before starting tool tasks:

  1. Pre-execution limit check: Before creating async tasks for parallel tools, we now check how many tool calls are remaining within the limit
  2. Immediate error: Raise UsageLimitExceeded if the requested tool amount would violate the usage limit

@tradeqvest tradeqvest marked this pull request as ready for review September 22, 2025 13:10
@DouweM
Copy link
Collaborator

DouweM commented Sep 26, 2025

@tradeqvest Since we'll raise an error regardless, is it worth executing tool calls up to the limit, instead of just raising immediately?

I was thinking we could do something like this:

usage = ctx.state.usage
if ctx.deps.usage_limits.count_tokens_before_request:
# Copy to avoid modifying the original usage object with the counted usage
usage = deepcopy(usage)
counted_usage = await ctx.deps.model.count_tokens(message_history, model_settings, model_request_parameters)
usage.incr(counted_usage)
ctx.deps.usage_limits.check_before_request(usage)

Where we optimistically increment a copied version of the usage, and check the usage limit against that.

@tradeqvest
Copy link
Contributor Author

tradeqvest commented Sep 26, 2025

@tradeqvest Since we'll raise an error regardless, is it worth executing tool calls up to the limit, instead of just raising immediately?

I was thinking we could do something like this:

usage = ctx.state.usage
if ctx.deps.usage_limits.count_tokens_before_request:
# Copy to avoid modifying the original usage object with the counted usage
usage = deepcopy(usage)
counted_usage = await ctx.deps.model.count_tokens(message_history, model_settings, model_request_parameters)
usage.incr(counted_usage)
ctx.deps.usage_limits.check_before_request(usage)

Where we optimistically increment a copied version of the usage, and check the usage limit against that.

@DouweM It would definitely be simpler, yet I was thinking that the tool output up until the UsageLimit violation could still be of value, captured and further processed. Let me know what you think.

@DouweM
Copy link
Collaborator

DouweM commented Sep 29, 2025

@tradeqvest I think it'd be misleading if those results never get sent back to the model to use, and the user will think their action failed even though some tools (with side effects) may have in fact been executed. If we had a way to, instead of failing hard, tell the model "this call was not executed because you hit the limit" for the calls over the limit, executing the earlier ones makes sense, but until we have such a mode I'd rather not run the tools at all.

@tradeqvest tradeqvest force-pushed the fix-tool-call-limit branch 3 times, most recently from 9a459ab to f053b90 Compare October 2, 2025 08:01
@tradeqvest
Copy link
Contributor Author

@DouweM Good point! I've adapted the implementation to use the fail-fast approach you suggested.
Now it checks upfront whether the projected total would exceed the limit and raises immediately without executing any tools from that batch. Thanks for the feedback!

- Removed the unused `parts` variable in `UserPromptNode`.
- Inline limit check in _call_tools instead of separate function
- Pass usage directly as parameter rather than extracting from tool_manager
- Remove redundant per-tool check in ToolManager
- Align error message format with other usage limit errors
- Simplified the tool call logic by removing the unused usage_limits parameter from the _call_tool method in ToolManager.
@tradeqvest tradeqvest force-pushed the fix-tool-call-limit branch from 127eee7 to 6018ae1 Compare October 3, 2025 20:40
@tradeqvest tradeqvest requested a review from DouweM October 3, 2025 20:54
@tradeqvest tradeqvest changed the title fix: enforce tool call limit enforcement for parallel tool calls fix: enforce tool call limits for parallel tool calls Oct 3, 2025
@DouweM DouweM changed the title fix: enforce tool call limits for parallel tool calls Fix parallel tool call limit enforcement Oct 3, 2025
@DouweM DouweM enabled auto-merge (squash) October 3, 2025 22:06
@tradeqvest tradeqvest requested a review from DouweM October 3, 2025 22:07
@DouweM DouweM merged commit 3b8ff2c into pydantic:main Oct 3, 2025
29 checks passed
@DouweM
Copy link
Collaborator

DouweM commented Oct 3, 2025

@tradeqvest Thanks Niko!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants